[libc] More memory allocation changes for 8086 toolchain #2140
This is massive! Cool.
So this will allow ELKS native apps to use more than 64kb?
When using OW toolchain (which allows large memory model), yes. With other toolchains @ghaerr will know.
All applications built with OWC in large model can use more than 64k of code or data. On an 8086, up to 64k of code can be accessed through the CS register, and likewise an additional 64k of data through the DS register. In small model, the CS and DS registers are set once at program startup, so the program is limited to 64k of code and 64k of data. In large model, a data pointer is 32 bits: the top 16 bits hold a separate segment value and the lower 16 bits an offset into the 64k of data within that "segment". Thus a far pointer can use any of 2^16 = 65536 segments, each pointing to 64k of data. (Yes, these segments overlap; that's another discussion.)

The issue this PR fixes concerns the "default data segment". Even in large model programs, where the DS register can be set to point anywhere, there is only one default data segment, where the program stack, statically declared data and string literals reside. It fills up quickly with big programs. In addition, the normal "heap" lives in this same default data segment and uses only the space left, up to 64k, AFTER all the stack, data and literals are added. That can be quite small with large programs, which is the case with the large toolchain programs.

The normal storage allocator, malloc(), allocates only from the default data segment; that's the big problem. One can use fmemalloc to allocate from anywhere in memory, but each allocation costs 16 bytes in the kernel near data segment, which is subject to the same 64k limitation. In the 8086 toolchain, the default data segment was pretty much filled up. The arena allocator lets the default memory allocator (malloc) allocate from outside the default data segment, out of a separate, new data segment which is 64k to start. So there are lots more slots available for the hundreds or possibly thousands of allocations the toolchain requires.

This first implementation allows only a single separate 64k "arena". We will see if that is enough. If not, I have plans to expand it to allow an unlimited number of additional 64k "heap arenas" to allocate from. The overhead for an allocation in the arena allocator is only two bytes, versus 16 for fmemalloc.
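The segment:offset addressing described in this comment can be sketched in portable C. This is only an illustration of the arithmetic, not code from the PR: the names make_far, linear_addr and far_pointer_demo are made up for the example, and the sketch ignores the address wraparound above 1 MB on a real 8086.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a segment and an offset into one 32-bit value: segment in the
 * top 16 bits, offset in the low 16 bits, as in a large-model pointer.
 * (Hypothetical helper, named for this example only.) */
uint32_t make_far(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 16) | off;
}

/* Physical address the 8086 forms from segment:offset,
 * i.e. segment * 16 + offset. */
uint32_t linear_addr(uint32_t farptr)
{
    uint32_t seg = farptr >> 16;
    uint32_t off = farptr & 0xFFFFu;
    return (seg << 4) + off;
}

/* Two different far pointers can name the same byte, because
 * consecutive segment values start only 16 bytes apart. */
int far_pointer_demo(void)
{
    uint32_t a = make_far(0x1000, 0x0010);
    uint32_t b = make_far(0x1001, 0x0000);

    assert(a != b);                      /* different pointer values */
    assert(linear_addr(a) == 0x10010u);  /* 0x1000 * 16 + 0x10 */
    assert(linear_addr(b) == 0x10010u);  /* 0x1001 * 16 + 0 */
    return 0;
}
```

Calling far_pointer_demo() exercises the overlap case: 1000:0010 and 1001:0000 are distinct pointer values that resolve to the same physical address, which is why the 65536 possible segments do not add up to 65536 independent 64k regions.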
@ghaerr, should I PR the memory changes in the toolchain using this new memory allocation strategy, or wait for you to PR first so there are no merge conflicts?
Let me take a first pass at it so I can double check that the system actually works. I’ll do so later today. Thanks!
On Dec 17, 2024, at 11:47 AM, Rafael Diniz wrote:
> @ghaerr, should I PR the memory changes in the toolchain using this new memory allocation strategy, or wait for you to PR first so there are no merge conflicts?
The last of the bug fixes and enhancements to our new arena-based malloc for the 8086 toolchain. These enhancements currently only work for OpenWatcom C and large/compact model (i.e. 32-bit data pointers).
@rafael2k, the whole arena malloc is a bit complicated to explain, but basically a single wrapper file, mem.c (see below), will be used in each tool to fully encapsulate malloc, realloc and free. There will/can be no ifdefs or renames of malloc to memalloc in each tool; those will all have to be removed or put back to what they originally were (easily done by just deleting the ifdef ELKS portions entirely). The reason is that the C library routines themselves call malloc, and we need to replace all memory allocation calls: a pointer returned from the C library's malloc could be passed around inside a tool and then handed to our renamed allocator, which would cause a problem. With the full wrapper, even the C library routines that call malloc end up in our allocator. There are lots of other linker issues with the multiple memory allocators we now have available, but I think I've got it all straightened out so that it normally works and links automatically.
So, long story short, the following file mem.c will be added to each tool's project Makefile. Hopefully we will not need to customize the arena vs fmemalloc threshold (MALLOC_ARENA_THRESH), since the default small heap is 64K, but this can now be done programmatically using an extern int malloc_arena_thresh instead of changing the mem.c source code. For now, the arena allocator allocates a maximum of 65520 bytes from main memory and subdivides that for any allocations <= 1K bytes each, with the rest going to fmemalloc. The other good news is that by setting the following in a shell script before running a tool, a full heap analysis showing every allocation will be dumped:
Since it's a bit complicated at the moment, I'll try to take a first pass by pulling down your repo (I've saved my compiler bug fixes for later) and see if I can get all the tools running with the new allocator. I'll then post a PR for your review and we can go from there with regards to tuning.
The following 'mem.c' file wrapper must be included in each tool, as this can't go in the C library directly. The net effect is to force malloc, free and realloc from the tool or the C library to use the arena allocator and fmemalloc, instead of the default malloc/free/realloc.
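The mem.c wrapper itself is not reproduced in this text. As a rough illustration only of the size-based split described above (small requests to the arena, the rest to fmemalloc), here is a minimal portable sketch. Everything in it is an assumption made for the example: the names arena_thresh, choose_route and overhead_bytes are hypothetical, and the real wrapper may decide differently, for instance when the arena is full. Only the 1K threshold and the 2-byte and 16-byte per-allocation overheads come from the discussion.

```c
#include <stddef.h>

/* Which allocator a request would be sent to in this sketch. */
enum route { ROUTE_ARENA, ROUTE_FMEMALLOC };

/* Stand-in for the tunable threshold mentioned in the thread
 * (malloc_arena_thresh); 1024 reflects the "<= 1K bytes" figure. */
size_t arena_thresh = 1024;

/* Requests at or below the threshold go to the arena,
 * larger ones to fmemalloc. Assumed policy, for illustration. */
enum route choose_route(size_t size)
{
    return size <= arena_thresh ? ROUTE_ARENA : ROUTE_FMEMALLOC;
}

/* Total bookkeeping cost for `count` allocations at a given
 * per-allocation overhead (2 bytes for the arena, 16 for fmemalloc). */
unsigned long overhead_bytes(unsigned long count, unsigned per_alloc)
{
    return count * (unsigned long)per_alloc;
}
```

With these figures, 1000 small allocations would cost overhead_bytes(1000, 2) = 2000 bytes of bookkeeping in the arena, versus overhead_bytes(1000, 16) = 16000 bytes in the kernel near data segment if each one went through fmemalloc.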
If you have more questions, please ask, thanks!